84 research outputs found

    Is it ethical to avoid error analysis?

    Full text link
    Machine learning algorithms tend to create more accurate models when large datasets are available. In some cases, highly accurate models can hide the presence of bias in the data. Several published studies tackle the development of discrimination-aware machine learning algorithms. We center on the further evaluation of machine learning models through error analysis, to understand under what conditions a model does not work as expected. We focus on the ethical implications of avoiding error analysis, from the perspectives of falsification of results and discrimination. Finally, we show different ways to approach error analysis in non-interpretable machine learning algorithms such as deep learning. Comment: Presented as a poster at the 2017 Workshop on Fairness, Accountability, and Transparency in Machine Learning (FAT/ML 2017).
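As a minimal illustration of the kind of error analysis the abstract argues for, the sketch below computes per-slice error rates so that failures hidden by an aggregate accuracy figure become visible. The data and field names are hypothetical, not from the paper:

```python
from collections import defaultdict

def error_rates_by_slice(records, slice_key):
    """Group prediction records by a feature and compute the error
    rate within each slice, exposing where a model underperforms."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for rec in records:
        key = rec[slice_key]
        totals[key] += 1
        if rec["pred"] != rec["label"]:
            errors[key] += 1
    return {k: errors[k] / totals[k] for k in totals}

# Hypothetical predictions: overall accuracy looks fine (75%),
# but slicing by group reveals the model fails entirely on group "b".
records = [
    {"group": "a", "label": 1, "pred": 1},
    {"group": "a", "label": 0, "pred": 0},
    {"group": "a", "label": 1, "pred": 1},
    {"group": "b", "label": 1, "pred": 0},
]
print(error_rates_by_slice(records, "group"))  # {'a': 0.0, 'b': 1.0}
```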

    Learning Machine Learning: A Case Study

    Full text link

    Social shaping of digital publishing: exploring the interplay between culture and technology

    Get PDF
    The processes and forms of electronic publishing have been changing since the advent of the Web. In recent years, the open access movement has been a major driver of scholarly communication, and change is also evident in other fields such as e-government and e-learning. Whilst many changes are driven by technological advances, an altered social reality is also pushing the boundaries of digital publishing. With 23 articles and 10 posters, Elpub 2012 focuses on the social shaping of digital publishing and explores the interplay between culture and technology. This book contains the proceedings of the conference, consisting of 11 accepted full articles and 12 articles accepted as extended abstracts. The articles are presented in groups and cover the topics: digital scholarship and publishing; special archives; libraries and repositories; digital texts and readings; and future solutions and innovations. Offering an overview of the current situation and exploring the trends of the future, this book will be of interest to all those whose work involves digital publishing.

    Detecting ditches using supervised learning on high-resolution digital elevation models

    Get PDF
    Drained wetlands can constitute a large source of greenhouse gas emissions, but the drainage networks in these wetlands are largely unmapped, and better maps are needed both to aid forest production and to better understand the climate consequences. We develop a method for detecting ditches in high-resolution digital elevation models derived from LiDAR scans. Thresholding methods using digital terrain indices can be used to detect ditches; however, a single threshold generally does not capture the variability in the landscape and generates many false positives and negatives. We hypothesise that, by combining the digital terrain indices using supervised learning, we can improve ditch detection at a landscape scale. In addition to the digital terrain indices, additional features are generated by transforming the data to include neighbouring cells for better ditch predictions. A Random Forests classifier is used to locate the ditches, and its probability output is processed to remove noise and binarised to produce the final ditch prediction. The 95% confidence interval for Cohen's Kappa across the evaluation plots is [0.655, 0.781]. The study demonstrates that combining information from a suite of digital terrain indices using machine learning provides an effective technique for automatic ditch detection at a landscape scale, aiding both practical forest management and efforts to combat climate change.
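The neighbouring-cell feature transformation described above can be sketched as follows. This is an illustrative reconstruction in NumPy, not the authors' code; the 3×3 window and edge padding are assumptions:

```python
import numpy as np

def neighbourhood_features(index_grid):
    """Stack each cell's value with its 8 neighbours (edge-padded),
    turning a single terrain-index raster into a 9-feature array
    suitable for a per-cell classifier such as Random Forests."""
    rows, cols = index_grid.shape
    padded = np.pad(index_grid, 1, mode="edge")
    # One shifted view per position in the 3x3 window around each cell.
    shifts = [padded[r:r + rows, c:c + cols]
              for r in range(3) for c in range(3)]
    return np.stack(shifts, axis=-1)  # shape (rows, cols, 9)

grid = np.arange(16, dtype=float).reshape(4, 4)  # toy terrain index
feats = neighbourhood_features(grid)
print(feats.shape)  # (4, 4, 9); feats[..., 4] is the centre cell itself
```

Each raster cell then carries its own index value plus its neighbourhood context, which is what lets a per-cell classifier pick up linear ditch structures that a single-cell threshold misses.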

    Status Quo and Problems of Requirements Engineering for Machine Learning: Results from an International Survey

    Full text link
    Systems that use Machine Learning (ML) have become commonplace for companies that want to improve their products and processes. The literature suggests that Requirements Engineering (RE) can help address many problems when engineering ML-enabled systems. However, the empirical evidence on how RE is applied in practice in the context of ML-enabled systems is dominated by isolated case studies with limited generalizability. We conducted an international survey to gather practitioner insights into the status quo and problems of RE in ML-enabled systems. We gathered 188 complete responses from 25 countries. We conducted quantitative statistical analyses of contemporary practices using bootstrapping with confidence intervals, and qualitative analyses of the reported problems involving open and axial coding procedures. We found significant differences in RE practices within ML projects. For instance, (i) RE-related activities are mostly conducted by project leaders and data scientists, (ii) the prevalent requirements documentation format is interactive Notebooks, (iii) the main focus of non-functional requirements includes data quality, model reliability, and model explainability, and (iv) the main challenges include managing customer expectations and aligning requirements with data. The qualitative analyses revealed that practitioners face problems related to a lack of business domain understanding, unclear goals and requirements, low customer engagement, and communication issues. These results provide a better understanding of the adopted practices and of the problems that exist in practical environments. We put forward the need to further adapt and disseminate RE-related practices for engineering ML-enabled systems. Comment: Accepted for publication at PROFES 202
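The "bootstrapping with confidence intervals" mentioned above can be sketched as a percentile bootstrap. The answer data below is invented for illustration; only the respondent count of 188 comes from the abstract:

```python
import random

def bootstrap_ci(sample, stat, n_boot=2000, alpha=0.05, seed=42):
    """Percentile bootstrap confidence interval for any statistic:
    resample with replacement, recompute the statistic, and take
    the empirical alpha/2 and 1 - alpha/2 quantiles."""
    rng = random.Random(seed)
    stats = sorted(
        stat([rng.choice(sample) for _ in range(len(sample))])
        for _ in range(n_boot)
    )
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical survey answers: 1 = "uses interactive Notebooks for
# requirements documentation" (the split 120/68 is made up).
answers = [1] * 120 + [0] * 68  # 188 respondents, as in the survey
mean = lambda xs: sum(xs) / len(xs)
lo, hi = bootstrap_ci(answers, mean)
print(round(lo, 2), round(hi, 2))  # a 95% CI around the sample proportion
```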

    Evaluation of classifier performance and the impact of learning algorithm parameters

    No full text
    Much research has been done in the fields of classifier performance evaluation and optimization. This work summarizes that research and tries to answer the question of whether algorithm parameter tuning has more impact on performance than the choice of algorithm. An alternative way of evaluating classifiers, a measure function, is also demonstrated. This type of evaluation is compared with one of the most widely accepted methods, the cross-validation test. Experiments described in this work show that parameter tuning often has more impact on performance than the actual choice of algorithm, and that the measure function could be a complement or an alternative to standard cross-validation tests.
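The cross-validation test referred to above rests on splitting the data into k folds and holding each fold out in turn. A minimal, library-free sketch of the index splitting (an illustration, not the authors' implementation):

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k-fold cross-validation,
    distributing any remainder over the first folds so every
    example appears in exactly one test fold."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

folds = list(k_fold_indices(10, 3))
print([len(test) for _, test in folds])  # [4, 3, 3]
```

A parameter-tuning comparison of the kind the abstract describes would then train each algorithm configuration on every `train` split and average its score over the corresponding `test` splits.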

    On the Metric-Based Approach to Supervised Concept Learning

    No full text
    A classifier is a piece of software that is able to categorize objects for which the class is unknown. The task of automatically generating classifiers by generalizing from examples is an important problem in many practical applications. This problem is often referred to as supervised concept learning, and has been shown to be relevant in e.g. medical diagnosis, speech and handwriting recognition, stock market analysis, and other data mining applications. The main purpose of this thesis is to analyze current approaches to evaluating classifiers and supervised concept learners, and to explore possible improvements in terms of alternative or complementary approaches. In particular, we investigate the metric-based approach to evaluation as well as how it can be used when learning. Any supervised concept learning algorithm can be viewed as trying to generate a classifier that optimizes a specific, often implicit, metric (this is sometimes also referred to as the inductive bias of the algorithm). In addition, different metrics are suitable for different learning tasks, i.e., the requirements vary between application domains. The idea of metric-based learning is to both make the metric explicit and let it be defined by the user based on the learning task at hand. The thesis contains seven studies, each with its own focus and scope. First, we present an analysis of current evaluation methods and contribute with a formalization of the problems of learning, classification and evaluation. We then present two quality attributes, sensitivity and classification performance, that can be used to evaluate learning algorithms. To demonstrate their usefulness, two metrics for these attributes are defined and used to quantify the impact of parameter tuning and the overall performance. Next, we refine an approach to multi-criteria classifier evaluation, based on the combination of three metrics, and present algorithms for calculating these metrics.
In the fourth study, we present a new method for multi-criteria evaluation, which is generic in the sense that it only dictates how to combine metrics; the actual choice of metrics is application-specific. The fifth study investigates whether the performance according to an arbitrary application-specific metric can be boosted by using that metric as the one that the learning algorithm aims to optimize. The subsequent study presents a novel data mining application for preventing spyware by classifying End User License Agreements. A number of state-of-the-art learning algorithms are compared using the generic multi-criteria method. Finally, in the last study we describe how methods from the area of software engineering can be used to solve the problem of selecting relevant evaluation metrics for the application at hand.
Technological development has changed our way of life and shifted the focus of the global economy from the production of goods to the collection and refinement of information. One consequence of this change is that we have become more dependent on databases for storage and data processing. The number, and especially the size, of these databases is growing rapidly, which makes it increasingly difficult to extract useful information. Techniques and methods from data mining have proven well suited to this task, in industry as well as in several scientific and technical fields. Data mining, or knowledge discovery, is an interdisciplinary field related to artificial intelligence, statistics, database technology, and computer systems engineering. Its aim is to develop knowledge and methods for extracting useful information from large amounts of data. A common task is to extract information that can be used to describe different types of objects or events; this information can then be used to categorize those objects or events.
If the extracted information can be used to sort data into a limited number of categories, this suggests that it contains a general description of each category. Within machine learning, a computer science field closely related to artificial intelligence, methods have been developed that are particularly useful for automatically generating category descriptions by generalizing from already categorized data. Such methods are known as supervised concept learning. The methods are generally called learning algorithms, and the generated category descriptions are called classifiers, since they are used to classify, or categorize, data. Evaluating learning algorithms and classifiers is necessary both to ensure that a given method solves the studied problem sufficiently well and to be able to choose a suitable learning algorithm from the many available. The thesis addresses critical questions concerning the evaluation of learning systems and presents new approaches and metrics specifically intended for evaluating supervised concept learning algorithms and classifiers. Learning algorithms are typically evaluated by how accurately the learned classifiers categorize new data (data not used for learning, whose categories were not previously known to the learning algorithm). Accuracy is measured by letting a classifier categorize a set of data and dividing the number of correct categorizations by the total number of categorizations. Theoretical as well as empirical studies have demonstrated several shortcomings of this metric. First and foremost, the evaluation is limited because only a single quality aspect is examined.
Supervised concept learning is used across a broad spectrum of application areas (for example diagnosis, image and sound recognition, and prediction), and each specific application has its own set of goals and requirements that must be met. Moreover, accuracy is known to be an unreliable metric, since its assumptions, such as the data being evenly distributed across all categories, are seldom satisfied in real-world situations. Several alternative metrics have been proposed, but few methods exist for choosing suitable metrics for a given problem or application. The central theme of the thesis is the metric-based approach, which tailors evaluation and learning to specific applications by systematically choosing appropriate metrics based on an application's goals and requirements. Among other contributions, the thesis presents a general method for weighted evaluation over multiple metrics and proposes a method for systematically selecting metrics based on application goals. Furthermore, the thesis presents a method for metric-based learning, in which a learning algorithm takes the relevant metrics into account already during the learning phase. The results show that this method improves the possibilities of creating classifiers tailored to a particular application. The potential of metric-based evaluation and learning is judged to be great, as data mining is used to an increasing degree in both research and industry and the goals of its applications vary widely.
Today, data mining techniques and supervised concept learning are used in, among other areas, medicine (patient diagnosis and prediction of care needs), IT security (intrusion detection, spam filtering, detection of privacy-invasive software), video and image analysis (recognition via fingerprints and facial features), and automated language understanding (categorization of text documents and language identification).
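The generic multi-criteria method in this thesis dictates only how to combine metrics, leaving the choice of metrics application-specific. A minimal sketch of such a weighted combination; the metric names, values, and weights below are invented for illustration:

```python
def weighted_score(metrics, weights):
    """Generic multi-criteria score: combine several evaluation
    metrics (each assumed already scaled to [0, 1]) using
    user-chosen, application-specific weights that sum to 1."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(weights[name] * metrics[name] for name in weights)

# Hypothetical metric values for one classifier; the weights encode
# that this particular application cares most about sensitivity.
metrics = {"accuracy": 0.90, "sensitivity": 0.70, "auc": 0.85}
weights = {"accuracy": 0.2, "sensitivity": 0.5, "auc": 0.3}
print(weighted_score(metrics, weights))  # 0.2*0.90 + 0.5*0.70 + 0.3*0.85
```

Making the weights explicit is the point of the metric-based approach: two applications can rank the same set of classifiers differently simply by encoding different goals in `weights`.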
